In [1], we proved the asymptotic achievability of the Cramér-Rao bound in the compressive
sensing setting in the linear sparsity regime. In the proof, we used an erroneous closed-form
expression of $\alpha\sigma^2$ for the genie-aided Cramér-Rao bound $\sigma^2 \mathrm{Tr}\big((A_{\mathcal{I}}^* A_{\mathcal{I}})^{-1}\big)$ from Lemma 3.5, which appears
in Eqs. (20) and (29). The proof, however, holds if one avoids replacing $\sigma^2 \mathrm{Tr}\big((A_{\mathcal{I}}^* A_{\mathcal{I}})^{-1}\big)$ by the
expression of Lemma 3.5, and hence the claim of the Main Theorem stands true.

In Chapter 2 of the Ph.D. dissertation by Behtash Babadi [2], this error was fixed and a more
detailed proof in the non-asymptotic regime was presented. A draft of Chapter 2 of [2] is included
in this note, verbatim. We refer the interested reader to the full dissertation, which is
electronically archived in the ProQuest database [2], and a draft of which can be accessed through
the author’s homepage under: http://ece.umd.edu/~behtash/babadi_thesis_2011.pdf.
Asymptotic Achievability of the Cramér-Rao Bound for Noisy
Compressive Sampling


1 Introduction 


In this chapter, we consider the problem of estimating a sparse vector based on noisy observations.
Suppose that we have a compressive sampling model (see [3] and [4]) of the form

$$y = Ax + n \tag{1}$$

where $x \in \mathbb{C}^M$ is the unknown sparse vector to be estimated, $y \in \mathbb{C}^N$ is the observation vector,
$n \sim \mathcal{N}(0, \sigma^2 I_N) \in \mathbb{C}^N$ is the Gaussian noise and $A \in \mathbb{C}^{N \times M}$ is the measurement matrix. Suppose
that $x$ is sparse, i.e., $\|x\|_0 = L < M$. Let $\mathcal{I} := \mathrm{supp}(x)$ and

$$\alpha := L/N, \qquad \beta := M/L > 2$$

be fixed numbers.
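
As a concrete illustration of the model in Eq. (1), the following minimal sketch draws one random instance. It is only a toy under stated assumptions: the function name `make_instance`, the use of real-valued (rather than complex) entries, the amplitude range of the nonzero entries and the specific sizes are hypothetical choices made for illustration and do not come from the text.

```python
import numpy as np

def make_instance(M, L, N, sigma, seed=0):
    """Draw (A, x, y, support) for y = A x + n (real-valued toy version)."""
    rng = np.random.default_rng(seed)
    # Support I: L distinct locations out of M.
    support = np.sort(rng.choice(M, size=L, replace=False))
    x = np.zeros(M)
    # Nonzero entries with magnitude in [1, 2) and random sign (illustrative choice).
    x[support] = rng.choice([-1.0, 1.0], size=L) * rng.uniform(1.0, 2.0, size=L)
    A = rng.standard_normal((N, M))      # i.i.d. N(0, 1) measurement matrix
    n = sigma * rng.standard_normal(N)   # Gaussian noise with variance sigma^2
    y = A @ x + n
    return A, x, y, support

M, L, N = 60, 5, 30
A, x, y, support = make_instance(M=M, L=L, N=N, sigma=0.1)
alpha, beta = L / N, M / L               # the fixed ratios of the linear sparsity regime
```

Here `alpha` and `beta` play the roles of L/N and M/L; the sizes are chosen so that M/L > 2, as the model requires.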

The estimator must estimate both the locations and the values of the non-zero elements of $x$. If
a Genie provides us with $\mathcal{I}$, the problem reduces to estimating the values of the non-zero elements
of $x$. We refer to the estimator for this reduced problem as the Genie-Aided estimator (GAE).

Clearly, the mean squared estimation error (MSE) of any unbiased estimator is no less than
that of the GAE (see [5]), since the GAE does not need to estimate the locations of the nonzero
elements of $x$ ($\log_2 \binom{M}{L} \approx M H(1/\beta)$ bits, where $H(\cdot)$ is the binary entropy function).
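
The number of bits needed to describe the support, i.e. the base-2 logarithm of the number of size-L subsets of M locations, can be compared numerically with M times the binary entropy of L/M. The sketch below does this for one hypothetical pair of sizes (M = 1000, L = 100, chosen only for illustration); the helper name `binary_entropy` is likewise an assumption.

```python
import math

def binary_entropy(p):
    """Binary entropy H(p) in bits, for 0 < p < 1."""
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

M, L = 1000, 100
exact_bits = math.log2(math.comb(M, L))    # log2 of the number of possible supports
entropy_bits = M * binary_entropy(L / M)   # M * H(1/beta) with beta = M / L
ratio = exact_bits / entropy_bits
```

The entropy expression is an upper bound on the exact count, and the two agree to first order in M, which is the sense of the approximation used in the text.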

Recently, Haupt and Nowak [6] and Candès and Tao [5] have proposed estimators which achieve
the estimation error of the GAE up to a factor of $\log M$. In [6], a measurement matrix based on
Rademacher projections is constructed and an iterative bound-optimization recovery procedure is
proposed. Each step of the procedure requires $O(MN)$ operations and the iterations are repeated
until convergence is achieved. It has been shown that the estimator achieves the estimation error
of the GAE up to a factor of $\log M$.

Candès and Tao have proposed an estimator based on linear programming, namely the
Dantzig Selector, which achieves the estimation error of the GAE up to a factor of $\log M$, for
Gaussian measurement matrices. The Dantzig Selector can be recast as a linear program and can
be efficiently solved by the well-known primal-dual interior point methods, as suggested in [5]. Each
iteration requires solving an $M \times M$ system of linear equations and the iterations are repeated until
convergence is attained.

We construct an estimator based on Shannon theory and the notion of typicality [7] that
asymptotically achieves the Cramér-Rao bound on the estimation error of the GAE without the knowledge
of the locations of the nonzero elements of $x$, for Gaussian measurement matrices. Although the
estimator presented in this chapter has higher (exponential) complexity compared to the
estimators in [5] and [6], to the best of our knowledge it is the first result establishing the achievability
of the Cramér-Rao bound for noisy compressive sampling [1]. The problem of finding efficient and
low-complexity estimators that achieve the Cramér-Rao bound for noisy compressive sampling still
remains open.

The outline of this chapter follows next. In Section 2, we state the main result of this chapter
and present its proof in Section 3.

^Chapter 2 of [2]. 
2 Main Result

The main result of this section is the following: 


Theorem 1 (Main Theorem). In the compressive sampling model of $y = Ax + n$, let $A$ be a
measurement matrix whose elements are i.i.d. Gaussian $\mathcal{N}(0,1)$. Let $e_G^{(M)}$ be the Cramér-Rao
bound on the mean squared error of the GAE, $\mu(x) := \min_{i \in \mathcal{I}} |x_i|$ and $\alpha$ and $\beta$ be fixed numbers. If

• $L\mu^4(x)/\log L \to \infty$ as $M \to \infty$,

• $\|x\|_2^2$ grows slower than $L^{\kappa}$, for some positive constant $\kappa$,

• $N > CL$, where $C = \max\{C_1, C_2\}$ with $C_1 := (18\kappa\sigma^4 + 1)$ and $C_2 := (9 + 4\log(\beta - 1))$,

assuming that the locations of the nonzero elements of $x$ are not known, there exists an estimator
(namely, the joint typicality decoder) for the nonzero elements of $x$ with mean squared error $e_\delta^{(M)}$, such
that with high probability

$$\limsup_{M} \left| e_\delta^{(M)} - e_G^{(M)} \right| = 0.$$


3 Proof of the Main Theorem 


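In order to establish the Main Result, we need to specify the Cramér-Rao bound of the GAE and
define the joint typicality decoder. The following lemma gives the Cramér-Rao bound:

Lemma 2. For any unbiased estimator $\hat{x}$ of $x$,

$$E\{\|\hat{x} - x\|_2^2\} \ge \sigma^2 \mathrm{Tr}\big((A_{\mathcal{I}}^* A_{\mathcal{I}})^{-1}\big) \tag{2}$$

where $A_{\mathcal{I}}$ is the sub-matrix of $A$ with columns corresponding to the index set $\mathcal{I}$.

Proof. Assuming that a Genie provides us with $\mathcal{I}$, we have

$$y = A_{\mathcal{I}} x_{\mathcal{I}} + n \tag{3}$$

where $x_{\mathcal{I}}$ is the sub-vector of $x$ with elements corresponding to the index set $\mathcal{I}$. The Fisher
information matrix is then given by

$$J = \frac{1}{\sigma^2} A_{\mathcal{I}}^* A_{\mathcal{I}}. \tag{4}$$

Therefore, for any unbiased estimator $\hat{x}$, by the Cramér-Rao bound [8],

$$E\{\|\hat{x} - x\|_2^2\} \ge \mathrm{Tr}(J^{-1}) = \sigma^2 \mathrm{Tr}\big((A_{\mathcal{I}}^* A_{\mathcal{I}})^{-1}\big). \tag{5}$$

□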
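
The genie-aided bound can be checked numerically. The sketch below, a real-valued toy under assumed sizes, computes the trace expression of the bound for a known support and compares it with the empirical MSE of least squares restricted to that support, which is an efficient estimator for the linear Gaussian model. The function names `genie_crb` and `genie_ls_mse` and all numerical values are hypothetical.

```python
import numpy as np

def genie_crb(A, support, sigma):
    """sigma^2 * Tr((A_I^T A_I)^{-1}) for the sub-matrix A_I = A[:, support]."""
    A_I = A[:, support]
    return sigma**2 * np.trace(np.linalg.inv(A_I.T @ A_I))

def genie_ls_mse(A, x, support, sigma, trials=2000, seed=1):
    """Monte Carlo MSE of least squares on the known support."""
    rng = np.random.default_rng(seed)
    A_I = A[:, support]
    pinv = np.linalg.pinv(A_I)               # (A_I^T A_I)^{-1} A_I^T
    errs = np.empty(trials)
    for t in range(trials):
        y = A @ x + sigma * rng.standard_normal(A.shape[0])
        errs[t] = np.sum((pinv @ y - x[support]) ** 2)
    return errs.mean()

rng = np.random.default_rng(0)
N, M, L, sigma = 40, 100, 5, 0.5
A = rng.standard_normal((N, M))
support = np.arange(L)                       # illustrative support {0, ..., L-1}
x = np.zeros(M)
x[support] = 1.0
crb = genie_crb(A, support, sigma)
mse = genie_ls_mse(A, x, support, sigma)
```

For this model the empirical `mse` should fluctuate around `crb`, since the genie-aided least-squares error has expectation equal to the trace expression.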


Next, we state a lemma regarding the rank of sub-matrices of random i.i.d. Gaussian matrices: 
Lemma 3. Let $A$ be a measurement matrix whose elements are i.i.d. Gaussian $\mathcal{N}(0,1)$, $\mathcal{J} \subset
\{1, 2, \cdots, M\}$ such that $|\mathcal{J}| = L$ and $A_{\mathcal{J}}$ be the sub-matrix of $A$ with columns corresponding to the
index set $\mathcal{J}$. Then, $\mathrm{rank}(A_{\mathcal{J}}) = L$ with probability 1.

We can now define the notion of joint typicality. We adopt the definition from [9]:

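Definition 4. We say an $N \times 1$ noisy observation vector, $y = Ax + n$, and a set of indices
$\mathcal{J} \subset \{1, 2, \cdots, M\}$, with $|\mathcal{J}| = L$, are $\delta$-jointly typical if $\mathrm{rank}(A_{\mathcal{J}}) = L$ and

$$\left| \frac{1}{N} \left\| \Pi_{\mathcal{J}}^{\perp} y \right\|_2^2 - \frac{N-L}{N}\sigma^2 \right| < \delta \tag{6}$$

where $A_{\mathcal{J}}$ is the sub-matrix of $A$ with columns corresponding to the index set $\mathcal{J}$ and
$\Pi_{\mathcal{J}}^{\perp} := I - A_{\mathcal{J}}(A_{\mathcal{J}}^* A_{\mathcal{J}})^{-1} A_{\mathcal{J}}^*$. We denote the event of $\delta$-joint typicality of $\mathcal{J}$ with $y$ by $E_{\mathcal{J}}$.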

Note that we can make the assumption of $\mathrm{rank}(A_{\mathcal{J}}) = L$ without loss of generality, based on
Lemma 3.

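Definition 5 (Joint Typicality Decoder). The joint typicality decoder finds a set of indices $\hat{\mathcal{I}}$ which
is $\delta$-jointly typical with $y$, by projecting $y$ onto all the possible $L$-dimensional subspaces spanned
by the columns of $A$ and choosing the one satisfying Eq. (6). It then produces the estimate $\hat{x}^{(\hat{\mathcal{I}})}$ by
projecting $y$ onto the subspace spanned by $A_{\hat{\mathcal{I}}}$:

$$\hat{x}^{(\hat{\mathcal{I}})} = \begin{cases} (A_{\hat{\mathcal{I}}}^* A_{\hat{\mathcal{I}}})^{-1} A_{\hat{\mathcal{I}}}^* y & \text{on } \hat{\mathcal{I}}, \\ 0 & \text{elsewhere.} \end{cases} \tag{7}$$

If the estimator does not find any set $\delta$-typical to $y$, it will output the zero vector as the estimate.
We denote this event by $E_0$.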
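
The decoder of Definition 5 can be sketched as a brute-force search. The following is a minimal real-valued illustration, not an efficient or definitive implementation: the function name `joint_typicality_decoder`, the choice to return the first typical set found in lexicographic order, and the tiny problem sizes are all assumptions made so that the exhaustive search over index sets stays tractable.

```python
import itertools
import numpy as np

def joint_typicality_decoder(A, y, L, sigma, delta):
    """Brute-force search over all size-L index sets (exponential cost)."""
    N, M = A.shape
    target = (N - L) / N * sigma**2
    for J in itertools.combinations(range(M), L):
        A_J = A[:, list(J)]
        if np.linalg.matrix_rank(A_J) < L:
            continue                          # typicality requires full column rank
        coef, *_ = np.linalg.lstsq(A_J, y, rcond=None)
        resid = y - A_J @ coef                # component of y orthogonal to span(A_J)
        if abs(resid @ resid / N - target) < delta:
            x_hat = np.zeros(M)
            x_hat[list(J)] = coef             # least-squares values on J, zero elsewhere
            return x_hat, J
    return np.zeros(M), None                  # no typical set: output the zero vector

rng = np.random.default_rng(0)
N, M, L, sigma = 8, 12, 2, 0.05
A = rng.standard_normal((N, M))
x = np.zeros(M)
x[[3, 7]] = [1.0, -1.0]
y = A @ x + sigma * rng.standard_normal(N)
x_hat, J = joint_typicality_decoder(A, y, L, sigma, delta=0.01)
```

The loop visits on the order of M-choose-L index sets, which makes the exponential complexity of the decoder explicit.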

In what follows, we show that the joint typicality decoder has the property stated in the Main
Theorem. The next lemma from [9] gives non-asymptotic bounds on the probability of the error
events $E_{\mathcal{I}}^c$ and $E_{\mathcal{J}}$, averaged over the ensemble of random i.i.d. Gaussian measurement matrices:

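Lemma 6 (Lemma 3.3 of [9]). For any $\delta > 0$,

$$P\left( \left| \frac{1}{N}\left\|\Pi_{\mathcal{I}}^{\perp} y\right\|_2^2 - \frac{N-L}{N}\sigma^2 \right| > \delta \right) \le 2\exp\left( -\frac{\delta^2}{4\sigma^4}\,\frac{N^2}{N - L + \frac{2\delta}{\sigma^2}N} \right) \tag{8}$$

and

$$P\left( \left| \frac{1}{N}\left\|\Pi_{\mathcal{J}}^{\perp} y\right\|_2^2 - \frac{N-L}{N}\sigma^2 \right| < \delta \right) \le \exp\left( -\frac{N-L}{4}\left( \frac{\sum_{k \in \mathcal{I}\setminus\mathcal{J}} |x_k|^2 - \delta'}{\sum_{k \in \mathcal{I}\setminus\mathcal{J}} |x_k|^2 + \sigma^2} \right)^2 \right) \tag{9}$$

where $\mathcal{J}$ is an index set such that $|\mathcal{J}| = L$, $|\mathcal{I} \cap \mathcal{J}| = K < L$, $\mathrm{rank}(A_{\mathcal{J}}) = L$ and

$$\delta' := \delta\,\frac{N}{N-L}.$$

Proof. The proof is given in [9].

□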
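
The concentration of the typicality statistic around $(N-L)\sigma^2/N$ for the true support can be illustrated by simulation. The sketch below is a real-valued toy; the helper names `typicality_statistic` and `first_tail_bound`, the sizes, and the reading of the first tail bound as $2\exp\big(-\tfrac{\delta^2}{4\sigma^4} N^2 / (N - L + 2\delta N/\sigma^2)\big)$ are assumptions made for illustration only.

```python
import numpy as np

def typicality_statistic(A_J, y):
    """(1/N) * ||Pi_J^perp y||^2, computed with an orthonormal basis of span(A_J)."""
    Q, _ = np.linalg.qr(A_J)
    r = y - Q @ (Q.T @ y)
    return r @ r / len(y)

def first_tail_bound(N, L, sigma, delta):
    """Assumed form of the tail bound for the true support."""
    return 2 * np.exp(-(delta**2 / (4 * sigma**4)) * N**2
                      / (N - L + 2 * delta * N / sigma**2))

rng = np.random.default_rng(0)
N, L, sigma, delta, trials = 200, 10, 1.0, 0.3, 500
A_I = rng.standard_normal((N, L))
x_I = np.ones(L)
stats = np.array([
    typicality_statistic(A_I, A_I @ x_I + sigma * rng.standard_normal(N))
    for _ in range(trials)
])
center = (N - L) / N * sigma**2
freq = np.mean(np.abs(stats - center) > delta)   # empirical atypicality frequency
bound = first_tail_bound(N, L, sigma, delta)
```

Because the projection removes the signal component lying in the span of the true columns, the statistic depends only on the noise, which is why its mean equals `center` regardless of the signal values.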


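Proof of the Main Theorem. Let $\mathbb{A}$ be the set of all $N \times M$ matrices with i.i.d. Gaussian $\mathcal{N}(0,1)$
entries. Consider $A \in \mathbb{A}$. Then, it is known that [3] for all $\mathcal{K} \subset \{1, 2, \cdots, M\}$ such that $|\mathcal{K}| = L$,

$$P\left( \lambda_{\max}\left( \tfrac{1}{N} A_{\mathcal{K}}^* A_{\mathcal{K}} \right) > (1 + \sqrt{\alpha} + \epsilon)^2 \right) \le \exp\left( -\tfrac{1}{2} N \epsilon^2 \right) \tag{10}$$

and

$$P\left( \lambda_{\min}\left( \tfrac{1}{N} A_{\mathcal{K}}^* A_{\mathcal{K}} \right) < (1 - \sqrt{\alpha} - \epsilon)^2 \right) \le \exp\left( -\tfrac{1}{2} N \epsilon^2 \right) \tag{11}$$

Let $\mathbb{A}_\epsilon^{(\alpha)} \subset \mathbb{A}$, such that for all $A \in \mathbb{A}_\epsilon^{(\alpha)}$, the eigenvalues of $\tfrac{1}{N} A_{\mathcal{K}}^* A_{\mathcal{K}}$ lie in the interval
$[(1 - \sqrt{\alpha} - \epsilon)^2, (1 + \sqrt{\alpha} + \epsilon)^2]$. Clearly, we have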


^ 4 ( 4 “)) > l-2exp 



vwm \ 
) 


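where $P_A(\cdot)$ denotes the Gaussian measure defined over $\mathbb{A}$. It is easy to show that

$$E_{A \in \mathbb{A}_\epsilon^{(\alpha)}}\{f(A)\} \le \frac{E_{A \in \mathbb{A}}\{f(A)\}}{P_A\big(\mathbb{A}_\epsilon^{(\alpha)}\big)} \tag{12}$$

where $f(A)$ is any function of $A$ for which $E_{A \in \mathbb{A}}\{f(A)\}$ exists. In what follows, we will
upper-bound the MSE of the joint typicality decoder, averaged over all Gaussian measurement matrices
in $\mathbb{A}_\epsilon^{(\alpha)}$. Let $e_\delta^{(M)}$ denote the MSE of the joint typicality decoder. We have:

$$\begin{aligned} e_\delta^{(M)} &= E_n\{\|\hat{x} - x\|_2^2\} \\ &\le E_n\{\|\hat{x}^{(\mathcal{I})} - x\|_2^2\} + \|x\|_2^2\, E_n\{\mathrm{I}(E_{\mathcal{I}}^c)\} + \sum_{\mathcal{J} \ne \mathcal{I}} E_n\{\|\hat{x}^{(\mathcal{J})} - x\|_2^2\, \mathrm{I}(E_{\mathcal{J}})\} =: \bar{e}_\delta^{(M)} \end{aligned} \tag{13}$$

where $\mathrm{I}(\cdot)$ is the indicator function, $E_n\{\cdot\}$ denotes the expectation operator defined over the noise
density, and the inequality follows from the union bound and the facts that $E_n\{\mathrm{I}(E_0)\} =: P(E_0) \le
P(E_{\mathcal{I}}^c) := E_n\{\mathrm{I}(E_{\mathcal{I}}^c)\}$ and $\mathrm{I}(E_{\mathcal{I}}) \le 1$. Now, averaging the upper bound $\bar{e}_\delta^{(M)}$ (defined in (13)) over
$\mathbb{A}_\epsilon^{(\alpha)}$ yields:

$$E_A\{\bar{e}_\delta^{(M)}\} = \int_{\mathbb{A}_\epsilon^{(\alpha)}} E_n\{\|\hat{x}^{(\mathcal{I})} - x\|_2^2\}\, dP(A) + \int_{\mathbb{A}_\epsilon^{(\alpha)}} \|x\|_2^2\, E_n\{\mathrm{I}(E_{\mathcal{I}}^c)\}\, dP(A) + \sum_{\mathcal{J} \ne \mathcal{I}} \int_{\mathbb{A}_\epsilon^{(\alpha)}} E_n\{\|\hat{x}^{(\mathcal{J})} - x\|_2^2\, \mathrm{I}(E_{\mathcal{J}})\}\, dP(A) \tag{14}$$

where $dP(A)$ and $E_A$ denote the conditional Gaussian probability measure and the expectation
operator defined on the set $\mathbb{A}_\epsilon^{(\alpha)}$, respectively.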
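The first term on the right hand side of Eq. (14) can be written as:

$$\int_{\mathbb{A}_\epsilon^{(\alpha)}} E_n\left\{ \left\| (A_{\mathcal{I}}^* A_{\mathcal{I}})^{-1} A_{\mathcal{I}}^* y - x_{\mathcal{I}} \right\|_2^2 \right\} dP(A) = E_{n,A}\left\{ \left\| (A_{\mathcal{I}}^* A_{\mathcal{I}})^{-1} A_{\mathcal{I}}^* n \right\|_2^2 \right\} = E_A\left\{ \sigma^2 \mathrm{Tr}\big((A_{\mathcal{I}}^* A_{\mathcal{I}})^{-1}\big) \right\}. \tag{15}$$

The second term on the right hand side of Eq. (14) can be upper-bounded as

$$\int_{\mathbb{A}_\epsilon^{(\alpha)}} \|x\|_2^2\, E_n\{\mathrm{I}(E_{\mathcal{I}}^c)\}\, dP(A) \le \frac{2\|x\|_2^2}{P_A\big(\mathbb{A}_\epsilon^{(\alpha)}\big)} \exp\left( -\frac{\delta^2}{4\sigma^4}\,\frac{N^2}{N - L + \frac{2\delta}{\sigma^2}N} \right) \tag{16}$$

where the inequality follows from Lemma 6 and Eq. (12).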

Now, let $\delta' = \delta N/(N - L) = \zeta\mu^2(x)$ for some $2/3 < \zeta < 1$. Recall that by hypothesis, $\mu^4(x)$
goes to zero slower than $\log L / L$. Suppose that $L$ is large enough, so that $\sigma^2 > 2\zeta\mu^2(x)$. Then, it
can be shown that the exponential term on the right hand side of Eq. (16) decays faster than $1/L^{c}$, where $c := \zeta^2 C_0/2\sigma^4$ and $C_0 := (N - L)/4L$.
Now, by the hypothesis of $N > (18\kappa\sigma^4 + 1)L$ we have $C_0 > 9\kappa\sigma^4/2$, therefore $c > \kappa$, and hence the
right hand side of Eq. (16) goes to zero faster than $1/L^{(c-\kappa)}$ as $M, N \to \infty$.
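
The arithmetic behind the claim that the decay exponent exceeds $\kappa$ can be spot-checked numerically. The sketch below takes $C_0 := (N-L)/(4L)$ and $c := \zeta^2 C_0/(2\sigma^4)$ as working assumptions, evaluates $c$ at the boundary ratio $N/L = 18\kappa\sigma^4 + 1$, and confirms that it exceeds $\kappa$ for a few hypothetical parameter values; the function name `decay_exponent` is illustrative.

```python
def decay_exponent(n_over_l, zeta, sigma):
    """c = zeta^2 * C0 / (2 sigma^4) with C0 = (N - L) / (4 L), as a function of N / L."""
    c0 = (n_over_l - 1.0) / 4.0
    return zeta**2 * c0 / (2.0 * sigma**4)

# Evaluate at the boundary N / L = 18 * kappa * sigma^4 + 1, where c = 9 * zeta^2 * kappa / 4.
results = []
for kappa in (0.5, 1.5):
    for sigma in (0.8, 1.2):
        for zeta in (0.7, 0.9):          # any zeta in (2/3, 1)
            c = decay_exponent(18 * kappa * sigma**4 + 1, zeta, sigma)
            results.append((kappa, sigma, zeta, c))
```

At the boundary the exponent reduces to $9\zeta^2\kappa/4$, which is larger than $\kappa$ exactly when $\zeta > 2/3$; this is where the lower limit on $\zeta$ enters.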

Remark. If we assume the weaker condition that $\mu(x) = \mu_0$, constant independent of $M$, $N$
and $L$, then $\delta$ will be constant and the exponential term in Eq. (16) decays to zero exponentially
fast. Hence, as long as $\|x\|_2^2$ grows polynomially, the whole term on the right hand side of Eq. (16)
decays to zero exponentially fast. Therefore, the claim of the theorem holds with overwhelming
probability, i.e., probability of failure exponentially small in $L$ (rather than with high probability,
which refers to the case of failure probability polynomially small in $L$).

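Finally, consider the term corresponding to $\mathcal{J}$ in the third expression on the right hand side of
Eq. (14). This term can be simplified as

$$\begin{aligned} E_{n,A}\{\|\hat{x}^{(\mathcal{J})} - x\|_2^2\, \mathrm{I}(E_{\mathcal{J}})\} &= E_{n,A}\{\|\hat{x}^{(\mathcal{J})}_{\mathcal{J}} - x_{\mathcal{J}}\|_2^2\, \mathrm{I}(E_{\mathcal{J}})\} + E_{n,A}\{\|x_{\mathcal{I}\setminus\mathcal{J}}\|_2^2\, \mathrm{I}(E_{\mathcal{J}})\} \\ &= \underbrace{E_{n,A}\{\|(A_{\mathcal{J}}^* A_{\mathcal{J}})^{-1} A_{\mathcal{J}}^* A x - x_{\mathcal{J}}\|_2^2\, \mathrm{I}(E_{\mathcal{J}})\}}_{T_1} + \underbrace{E_{n,A}\{\|(A_{\mathcal{J}}^* A_{\mathcal{J}})^{-1} A_{\mathcal{J}}^* n\|_2^2\, \mathrm{I}(E_{\mathcal{J}})\}}_{T_2} \\ &\quad + E_{n,A}\{\|x_{\mathcal{I}\setminus\mathcal{J}}\|_2^2\, \mathrm{I}(E_{\mathcal{J}})\} \end{aligned} \tag{17}$$

where the first equality follows from the fact that $\hat{x}^{(\mathcal{J})}$ is zero outside of the index set $\mathcal{J}$ and the
second equality follows from the assumption that $n$ is zero-mean and independent of $A$ and $x$. Now,
the term $T_1$ can be further simplified as

$$\begin{aligned} T_1 &= E_{n,A}\{\|(A_{\mathcal{J}}^* A_{\mathcal{J}})^{-1} A_{\mathcal{J}}^* (A_{\mathcal{J}} x_{\mathcal{J}} + A_{\mathcal{I}\setminus\mathcal{J}}\, x_{\mathcal{I}\setminus\mathcal{J}}) - x_{\mathcal{J}}\|_2^2\, \mathrm{I}(E_{\mathcal{J}})\} \\ &= E_{n,A}\{\|(A_{\mathcal{J}}^* A_{\mathcal{J}})^{-1} A_{\mathcal{J}}^* A_{\mathcal{I}\setminus\mathcal{J}}\, x_{\mathcal{I}\setminus\mathcal{J}}\|_2^2\, \mathrm{I}(E_{\mathcal{J}})\}. \end{aligned}$$

Invoking the sub-multiplicative property of matrix norms and applying the Cauchy-Schwarz
inequality to the mean of the product of non-negative random variables, we can further upper bound $T_1$
as:

$$T_1 \le E_A\{\|(A_{\mathcal{J}}^* A_{\mathcal{J}})^{-1}\|_2^2\}\; E_A\{\|A_{\mathcal{J}}^* A_{\mathcal{I}\setminus\mathcal{J}}\|_2^2\}\; \|x_{\mathcal{I}\setminus\mathcal{J}}\|_2^2\; E_{n,A}\{\mathrm{I}(E_{\mathcal{J}})\} \tag{18}$$

For simplicity, let $\epsilon = \sqrt{2\alpha}$. Now, consider the term $E_A\{\|(A_{\mathcal{J}}^* A_{\mathcal{J}})^{-1}\|_2^2\}$, the average of
the squared spectral norm of the matrix $(A_{\mathcal{J}}^* A_{\mathcal{J}})^{-1}$. Since $A \in \mathbb{A}_\epsilon^{(2\alpha)}$, the minimum eigenvalue of
$\tfrac{1}{N} A_{\mathcal{J}}^* A_{\mathcal{J}}$ is greater than or equal to $(1 - 2\sqrt{2\alpha})^2$, since $|\mathcal{J}| = L < 2L$, and $A_{\mathcal{J}}^* A_{\mathcal{J}}$ is a sub-matrix
of $A_{\mathcal{K}}^* A_{\mathcal{K}}$ for some $\mathcal{K}$ such that $\mathcal{J} \subset \mathcal{K}$ and $|\mathcal{K}| = 2L$ [10, Theorem 4.3.15]. Similarly, for the
second term, $A_{\mathcal{J}}^* A_{\mathcal{I}\setminus\mathcal{J}}$ is a sub-matrix of $A_{\mathcal{I}\cup\mathcal{J}}^* A_{\mathcal{I}\cup\mathcal{J}} - N I$. Since $|\mathcal{J} \cup \mathcal{I}| \le 2L$, the spectral norm of
$\tfrac{1}{N} A_{\mathcal{J}}^* A_{\mathcal{I}\setminus\mathcal{J}}$ is upper bounded by $(1 + 2\sqrt{2\alpha})^2 - 1 = 8\alpha + 4\sqrt{2\alpha}$ [10, Theorem 4.3.15], [11]. Finally,
since the averaging is over $\mathbb{A}_\epsilon^{(2\alpha)}$, the same bounds hold for the average of the matrix norms. Hence,
$T_1$ can be upper bounded by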


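$$T_1 \le \frac{(8\alpha + 4\sqrt{2\alpha})^2}{(1 - 2\sqrt{2\alpha})^4}\, \|x_{\mathcal{I}\setminus\mathcal{J}}\|_2^2\; E_{n,A}\{\mathrm{I}(E_{\mathcal{J}})\}.$$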

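Note that by the hypothesis of $N > C_2 L$, we have $\alpha < 1/9$. Hence, $2\sqrt{2\alpha} < 1$. Also, $T_2$
is similarly upper bounded by $\sigma^2 E_A\{\mathrm{Tr}\big((A_{\mathcal{J}}^* A_{\mathcal{J}})^{-1}\big)\}\, E_{n,A}\{\mathrm{I}(E_{\mathcal{J}})\}$, which can be further bounded
by

$$T_2 \le \frac{\alpha\sigma^2}{(1 - 2\sqrt{2\alpha})^2}\; E_{n,A}\{\mathrm{I}(E_{\mathcal{J}})\}.$$

Therefore, Eq. (17) can be further bounded as

$$E_{n,A}\{\|\hat{x}^{(\mathcal{J})} - x\|_2^2\, \mathrm{I}(E_{\mathcal{J}})\} \le \underbrace{\left[ \left( 1 + \frac{(8\alpha + 4\sqrt{2\alpha})^2}{(1 - 2\sqrt{2\alpha})^4} \right) \|x\|_2^2 + \frac{\alpha\sigma^2}{(1 - 2\sqrt{2\alpha})^2} \right]}_{C(x)}\; E_{n,A}\{\mathrm{I}(E_{\mathcal{J}})\}.$$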


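Now, from Lemma 6 and Eq. (12), we have

$$\begin{aligned} \sum_{\mathcal{J} \ne \mathcal{I}} E_{n,A}\{\|\hat{x}^{(\mathcal{J})} - x\|_2^2\, \mathrm{I}(E_{\mathcal{J}})\} &\le C(x) \sum_{\mathcal{J} \ne \mathcal{I}} E_{n,A}\{\mathrm{I}(E_{\mathcal{J}})\} \\ &\le \frac{C(x)}{P_A\big(\mathbb{A}_\epsilon^{(\alpha)}\big)} \sum_{\mathcal{J} \ne \mathcal{I}} \exp\left( -\frac{N-L}{4} \left( \frac{\sum_{k \in \mathcal{I}\setminus\mathcal{J}} |x_k|^2 - \delta'}{\sum_{k \in \mathcal{I}\setminus\mathcal{J}} |x_k|^2 + \sigma^2} \right)^2 \right) \end{aligned} \tag{19}$$

where $x_k$ denotes the $k$th component of $x$. The number of index sets $\mathcal{J}$ such that $|\mathcal{J} \cap \mathcal{I}| = K < L$
is upper-bounded by $\binom{L}{L-K}\binom{M-L}{L-K}$. Also, $\sum_{k \in \mathcal{I}\setminus\mathcal{J}} |x_k|^2 \ge (L - K)\mu^2(x)$. Therefore, we can upper-bound
the right hand side of Eq. (19) by

$$\begin{aligned} &\frac{C(x)}{P_A\big(\mathbb{A}_\epsilon^{(\alpha)}\big)} \sum_{K=0}^{L-1} \binom{L}{L-K}\binom{M-L}{L-K} \exp\left( -\frac{N-L}{4} \left( \frac{(L-K)\mu^2(x) - \delta'}{(L-K)\mu^2(x) + \sigma^2} \right)^2 \right) \\ &\quad = \frac{C(x)}{P_A\big(\mathbb{A}_\epsilon^{(\alpha)}\big)} \sum_{K'=1}^{L} \binom{L}{K'}\binom{M-L}{K'} \exp\left( -\frac{N-L}{4} \left( \frac{K'\mu^2(x) - \delta'}{K'\mu^2(x) + \sigma^2} \right)^2 \right). \end{aligned} \tag{20}$$

We use the inequality

$$\binom{L}{K'} \le \exp\left( K' \log\left( \frac{Le}{K'} \right) \right) \tag{21}$$

in order to upper-bound the $K'$th term of the summation in Eq. (20) by

$$\exp\left( L \left[ \frac{K'}{L} \log\left( \frac{e}{K'/L} \right) + \frac{K'}{L} \log\left( \frac{(\beta - 1)e}{K'/L} \right) - C_0 \left( \frac{K'\mu^2(x) - \delta'}{K'\mu^2(x) + \sigma^2} \right)^2 \right] \right) \tag{22}$$

where $C_0 := (N - L)/4L$. We define

$$f(z) := Lz \log\frac{e}{z} + Lz \log\frac{(\beta - 1)e}{z} - C_0 L \left( \frac{Lz\mu^2(x) - \delta'}{Lz\mu^2(x) + \sigma^2} \right)^2. \tag{23}$$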
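
The exponent function can be evaluated directly to see how its endpoint values behave. The sketch below is an illustrative implementation that takes $f(z) = Lz\log(e/z) + Lz\log((\beta-1)e/z) - C_0 L\big((Lz\mu^2 - \delta')/(Lz\mu^2 + \sigma^2)\big)^2$ as its working definition and compares $f(1/L)$ and $f(1)$ with the expressions obtained by direct substitution. The parameter values and the argument names (`mu2` for $\mu^2(x)$, `sigma2` for $\sigma^2$, `delta_p` for $\delta'$) are hypothetical.

```python
import math

def f(z, L, beta, C0, mu2, sigma2, delta_p):
    """Exponent f(z) for z = K'/L in (0, 1]."""
    ratio = (L * z * mu2 - delta_p) / (L * z * mu2 + sigma2)
    return (L * z * math.log(math.e / z)
            + L * z * math.log((beta - 1) * math.e / z)
            - C0 * L * ratio**2)

L, beta, C0, mu2, sigma2, delta_p = 50, 3.0, 5.0, 0.4, 1.0, 0.3

# Endpoint values obtained by substituting z = 1/L and z = 1 by hand.
f_low = (2 * math.log(L) + 2 + math.log(beta - 1)
         - C0 * L * ((mu2 - delta_p) / (mu2 + sigma2))**2)
f_high = (L * (2 + math.log(beta - 1))
          - C0 * L * ((L * mu2 - delta_p) / (L * mu2 + sigma2))**2)

values = [f(k / L, L, beta, C0, mu2, sigma2, delta_p) for k in range(1, L + 1)]
```

Scanning `values` over $K' = 1, \dots, L$ gives a quick way to inspect where the maximum of the exponent sits for a particular choice of parameters.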


By Lemmas 3.4, 3.5 and 3.6 of [5], it can be shown that $f(z)$ asymptotically attains its maximum
at either $z = \tfrac{1}{L}$ or $z = 1$, if $L\mu^4(x)/\log L \to \infty$ as $N \to \infty$. Thus, we can upper-bound the right
hand side of Eq. (20) by

$$\frac{C(x)}{P_A\big(\mathbb{A}_\epsilon^{(\alpha)}\big)} \sum_{K'=1}^{L} \exp\big( \max\{f(1/L), f(1)\} \big). \tag{24}$$

The values of $f(1/L)$ and $f(1)$ can be written as:

$$f(1/L) = 2\log L + 2 + \log(\beta - 1) - C_0 L \left( \frac{\mu^2(x) - \delta'}{\mu^2(x) + \sigma^2} \right)^2 \tag{25}$$

and

$$f(1) = L\big(2 + \log(\beta - 1)\big) - C_0 L \left( \frac{L\mu^2(x) - \delta'}{L\mu^2(x) + \sigma^2} \right)^2. \tag{26}$$

Since $C_0 > 2 + \log(\beta - 1)$ due to the assumption of $\alpha < 1/(9 + 4\log(\beta - 1))$, both $f(1)$ and $f(1/L)$
will grow to $-\infty$ linearly as $N \to \infty$. Hence, the exponent in Eq. (24) will grow to $-\infty$ as $N \to \infty$,
since $\|x\|_2^2$ in $C(x)$ grows polynomially in $L$ and $P_A\big(\mathbb{A}_\epsilon^{(\alpha)}\big)$ tends to 1 exponentially fast. Let $L$ be
large enough such that


'^(x) 


L. 

E oxp (max{/(l/L),/(!)}) 


where c = Now, by Markov’s inequality we have: 


< 


AM) 


- Tr(AiAi) 


-1 


> 


< 


r ^ 

L 2 


L— 


P 


(27) 





















Hence, for any measurement matrix $A$ from the restricted ensemble of i.i.d. Gaussian matrices
$\mathbb{A}_\epsilon^{(2\alpha)}$, we have



(T2Tr(AJAi)-i 


1 

< - 


(28) 


with probability exceeding 



(29) 


Now, since the probability measure is defined over the statement of Eq. 

probability exceeding 


1 — 2 exp 


1 J2H {2/m 

2 VP ) 


2 


holds with 
(30) 


over all the measurement matrices in the Gaussian ensemble A. 

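Recall that the expression $\sigma^2 \mathrm{Tr}\big((A_{\mathcal{I}}^* A_{\mathcal{I}})^{-1}\big)$ is indeed the Cramér-Rao bound of the GAE. Noting
that $\bar{e}_\delta^{(M)}$, defined in (13), is an upper bound on $e_\delta^{(M)}$ concludes the proof of the main theorem. □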